The Cost, Value, and Outcome of U.S. Higher Education¶

DS5610 Final Project¶

University Education

Group Members:

  • Julia [Last Name]: Cost Analysis (Entry)
  • Israel Lwamba: ROI & Value Analysis (Value)
  • Jackson [Last Name]: Career Outcomes Analysis (Exit)

Date: December 10, 2025

Note: This report contains interactive visualizations. Please hover over the maps and charts to view specific state and financial values.

1. Introduction¶

Higher education in the United States is often viewed as a golden ticket to financial stability, but the price of admission has skyrocketed. This project analyzes the lifecycle of a college degree, tracking the journey from the initial financial burden (Tuition & Cost) to the long-term financial payoff (Return on Investment) and finally to the realizable labor market outcomes (Salary & Career). By integrating federal IPEDS data, Census records, and ROI projections, we aim to determine where the "breakeven" points lie and which educational paths offer the highest tangible value.

2. Defining Key Terms¶

To ensure clarity across our financial and educational analysis, we define the following key metrics:

  • Net Present Value (NPV): The difference between the present value of cash inflows (earnings) and the present value of cash outflows (tuition/costs) over a period of time. In our context, this represents the 40-year ROI of a degree.
  • Cost of Attendance (COA): The total estimated cost of attending college for one year, including tuition, fees, housing, and food.
  • Return on Investment (ROI): A performance measure used to evaluate the efficiency of an investment. We use this to compare the long-term earnings of graduates against their initial college costs.
  • IPEDS: The Integrated Postsecondary Education Data System, the core federal database for U.S. college statistics.
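As a minimal sketch of the NPV definition above, the following discounts a stream of yearly earnings and costs back to the present. The function name, the 3% discount rate, and every dollar amount are illustrative assumptions, not values taken from our ROI dataset.

```python
def net_present_value(earnings, costs, discount_rate):
    """Discounted sum of (earnings - costs); year 0 is not discounted."""
    return sum(
        (e - c) / (1 + discount_rate) ** t
        for t, (e, c) in enumerate(zip(earnings, costs))
    )

# Hypothetical 40-year stream: 4 years of paying costs,
# then 36 years of additional earnings from the degree.
costs = [25_000] * 4 + [0] * 36
earnings = [0] * 4 + [20_000] * 36

npv = net_present_value(earnings, costs, discount_rate=0.03)
print(f"40-year NPV: ${npv:,.0f}")
```

A higher discount rate shrinks the value of later earnings more than it shrinks the up-front costs, so the same degree can look much less valuable under a different rate assumption.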

3. Motivation¶

As students ourselves, we are acutely aware of the "student debt crisis" narrative. However, looking at debt in isolation paints an incomplete picture. We were motivated to investigate whether high costs are justified by high returns. Does attending a college in an expensive state like Massachusetts actually pay off in the long run? By connecting cost data with salary outcomes, we hope to move the conversation from "College is too expensive" to "Which college paths offer the best value?"

4. Data Cleaning and Preparation¶

Our analysis relies on three groups of data (cost, ROI, and career outcomes), each requiring specific cleaning procedures to ensure accuracy and compatibility. Below, we detail the cleaning methodology for each section of the analysis.

4.1 Cost Data Cleaning (Julia)¶

The IPEDS datasets used in this analysis underwent a systematic cleaning and preparation process to ensure consistency, reliability, and analytic readiness. Two files were integrated: the 2024 DRVIC dataset, which contains institution-level tuition and cost-of-attendance information, and the 2024 HD dataset, which provides school characteristics including state identifiers.

Integration & Filtering:

  • After importing both files, all column names were trimmed for whitespace to avoid unintended merge mismatches.
  • Only variables relevant to tuition and attendance costs were retained from the DRVIC file—specifically four tuition fields (TUFEYR0–TUFEYR3) and four cost-of-attendance fields (CINDON, CINSON, CINDOFF, CINDFAM).
  • From the HD dataset, only UNITID and STABBR were kept.
  • The two datasets were merged on UNITID to create a unified institution-level dataset.
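The integration steps above might look roughly like the sketch below. The two small frames stand in for the DRVIC and HD files, all values are invented, and the choice of a left join is our assumption for illustration.

```python
import pandas as pd

# Tiny stand-ins for the DRVIC and HD files; values are invented.
drvic = pd.DataFrame({
    " UNITID": [1001, 1002, 1003],
    "TUFEYR3 ": [9500, 31000, 12000],
    "CINDON": [24000, 52000, 27000],
    "EXTRA": [1, 2, 3],
})
hd = pd.DataFrame({
    "UNITID ": [1001, 1002, 1004],
    " STABBR": ["TN", "MA", "TX"],
    "INSTNM": ["A", "B", "D"],
})

# Trim whitespace from column names so the merge key matches in both frames.
drvic.columns = drvic.columns.str.strip()
hd.columns = hd.columns.str.strip()

# Keep only the needed columns, then merge on the institution ID.
# how="left" is an assumption: it keeps every institution with cost data.
cost_cols = ["UNITID", "TUFEYR3", "CINDON"]
merged = drvic[cost_cols].merge(hd[["UNITID", "STABBR"]], on="UNITID", how="left")
print(merged)
```

With a left join, an institution that has cost data but no HD record ends up with a missing STABBR, which is why rows without a state have to be handled afterwards.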

Handling Inconsistencies: Several numeric fields contained formatting inconsistencies or non-numeric entries. To address this, all tuition and cost variables were converted to numeric using coercion, ensuring invalid entries were transformed into missing values rather than causing errors during analysis. Institutions missing state information (STABBR) were removed, since generating state-level summaries requires a valid geographic identifier.
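A minimal sketch of this coercion step is shown below. It uses a small invented frame and only two of the eight cost columns; the malformed entries are made up to show what coercion does.

```python
import pandas as pd

# Invented rows with the kinds of bad entries coercion is meant to absorb.
df = pd.DataFrame({
    "UNITID": [1, 2, 3, 4],
    "STABBR": ["TN", "MA", None, "TX"],
    "TUFEYR3": ["9500", "31,000", "12000", "."],
    "CINDON": ["24000", "52000", "N/A", "27000"],
})

cost_cols = ["TUFEYR3", "CINDON"]
# errors="coerce" turns anything unparsable into NaN instead of raising.
df[cost_cols] = df[cost_cols].apply(pd.to_numeric, errors="coerce")

# State-level summaries need a state, so drop rows without one.
df = df.dropna(subset=["STABBR"])
print(df)
```

Note that `pd.to_numeric` does not parse thousands separators, so a value such as "31,000" becomes missing unless the commas are stripped first.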

Aggregation: To create interpretable metrics, institution-level averages were constructed for both tuition and total cost of attendance by taking the mean across their respective four scenarios. The dataset was then aggregated to the state level by computing average tuition and average cost of attendance for each state. State abbreviations were standardized, stripped of whitespace, and filtered to include only valid U.S. states and the District of Columbia.
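The aggregation can be sketched as follows. The data are invented, only two of the four fields per measure are used, the set of valid states is shortened for brevity, and treating "standardized" as upper-casing is our assumption.

```python
import pandas as pd

# Invented institution-level rows; "PR" is included to show the state filter.
df = pd.DataFrame({
    "STABBR": [" tn", "TN ", "MA", "PR"],
    "TUFEYR0": [9000.0, 11000.0, 30000.0, 5000.0],
    "TUFEYR1": [9200.0, None, 31000.0, 5100.0],
    "CINDON": [24000.0, 26000.0, 52000.0, 15000.0],
    "CINSON": [25000.0, 27000.0, 53000.0, 15500.0],
})
tuition_cols = ["TUFEYR0", "TUFEYR1"]
coa_cols = ["CINDON", "CINSON"]

# Row-wise means skip missing values by default, so an institution
# with one missing scenario still gets an average from the rest.
df["avg_tuition"] = df[tuition_cols].mean(axis=1)
df["avg_coa"] = df[coa_cols].mean(axis=1)

# Standardize abbreviations, then keep only valid states.
df["STABBR"] = df["STABBR"].str.strip().str.upper()
valid_states = {"TN", "MA", "DC"}  # shortened; the full set is 50 states plus DC
df = df[df["STABBR"].isin(valid_states)]

state_avg = df.groupby("STABBR")[["avg_tuition", "avg_coa"]].mean()
print(state_avg)
```

This is an unweighted mean of institution averages, so a small college counts as much as a large university in its state's figure.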

Scraped Data Preparation: A second dataset—statewide tuition figures scraped from CollegeTuitionCompare.com—required additional preparation due to its multi-level column headers. The MultiIndex columns were flattened into descriptive variables such as public_in_state, public_out_state, and private. All monetary values included formatting symbols such as “$” and commas, which were removed before converting the fields to numeric types. Missing or non-numeric entries were coerced to null values.
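The sketch below illustrates this flattening and currency cleanup on a made-up two-row table. The header labels and dollar figures are invented for the example and are not the site's actual layout; only the target names public_in_state, public_out_state, and private come from our pipeline.

```python
import pandas as pd

# Invented stand-in for a scraped table with two header rows.
columns = pd.MultiIndex.from_tuples([
    ("State", ""),
    ("Public", "In-State"),
    ("Public", "Out-of-State"),
    ("Private", ""),
])
raw = pd.DataFrame(
    [["Tennessee", "$10,200", "$27,500", "$29,000"],
     ["Massachusetts", "$15,000", "$34,000", "N/A"]],
    columns=columns,
)

# Map each (top, bottom) header pair to one descriptive name.
rename = {
    ("State", ""): "state",
    ("Public", "In-State"): "public_in_state",
    ("Public", "Out-of-State"): "public_out_state",
    ("Private", ""): "private",
}
raw.columns = [rename[col] for col in raw.columns]

# Strip "$" and commas, then coerce; unparsable entries become NaN.
money_cols = ["public_in_state", "public_out_state", "private"]
for col in money_cols:
    cleaned = raw[col].str.replace(r"[$,]", "", regex=True)
    raw[col] = pd.to_numeric(cleaned, errors="coerce")

print(raw)
```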

4.2 ROI Data Cleaning¶

We sourced Return on Investment (ROI) data from the Foundation for Research on Equal Opportunity, which calculates the 40-year Net Present Value (NPV) of degrees. The raw dataset contained granular "Field of Study" entries (e.g., "Molecular Biology," "Zoology") that required grouping for meaningful analysis.

  • Text Processing: We removed currency symbols ($) and commas from the ROI columns and converted them to numerical floats.
  • Categorization: We applied a categorization logic to group thousands of specific majors into 7 broad fields: STEM, Healthcare, Business, Education, Arts & Humanities, Social Sciences, and Other.
  • Aggregation: We grouped the data by State (for the map) and by Broad Major (for the bar chart) to calculate mean ROI values.
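The three steps above can be sketched as follows. The column names, keyword rules, and ROI figures are hypothetical, and only four of the seven broad fields are shown; the actual categorization logic covers far more majors.

```python
import pandas as pd

# Invented rows in the shape of the raw ROI data.
roi = pd.DataFrame({
    "state": ["MA", "MA", "TN", "TN"],
    "field_of_study": ["Molecular Biology", "Nursing", "Zoology", "Elementary Education"],
    "roi": ["$450,000", "$700,000", "$120,000", "$95,000"],
})

# Text processing: strip "$" and commas, then convert to float.
roi["roi"] = roi["roi"].str.replace(r"[$,]", "", regex=True).astype(float)

# Hypothetical keyword rules; first match wins, anything unmatched is "Other".
KEYWORDS = [
    ("Healthcare", ("nursing", "health")),
    ("Education", ("education", "teaching")),
    ("STEM", ("biology", "zoology", "engineering", "computer")),
    ("Business", ("business", "accounting", "finance")),
]

def broad_field(name):
    """Return the broad field for a specific major name."""
    lowered = name.lower()
    for field, words in KEYWORDS:
        if any(word in lowered for word in words):
            return field
    return "Other"

roi["broad_major"] = roi["field_of_study"].map(broad_field)

# Aggregation: mean ROI by state (map) and by broad major (bar chart).
by_state = roi.groupby("state")["roi"].mean()
by_major = roi.groupby("broad_major")["roi"].mean()
print(by_state, by_major, sep="\n\n")
```

Because the first matching rule wins, the order of the rules matters for majors that span fields, such as a health education program.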

4.3 Career Data Cleaning (Jackson)¶

We filtered the College Scorecard dataset, published by the U.S. Department of Education, to focus on bachelor's degree recipients.

  • Filtering: We isolated data points relevant to entry-level earnings and debt loads.
  • Cleaning: Salary fields were cleaned of null values and standardized to allow for state-level salary vs. tuition comparisons.

5. Current Cost of College in 2024–2025¶

This section presents a comprehensive overview of the current cost structure of U.S. higher education using several complementary visual summaries:

  1. State-level averages of tuition and total cost of attendance.
  2. National averages by institution type.
  3. The distribution of the tuition-to-cost ratio.
  4. Paired state maps of tuition and total cost.

Together, these figures characterize both the level and composition of college costs in the 2024–2025 period and provide a basis for understanding how affordability varies across space, sectors, and cost categories.

The analysis proceeds in four steps:

  • Section 6.1 examines the spatial distribution of tuition and total cost of attendance across states.
  • Section 6.2 focuses on structural differences across institution types, highlighting the divide between public and private institutions.
  • Section 6.3 studies the relative weight of tuition within overall cost using the tuition-to-cost ratio.
  • Section 6.4 discusses structural factors that help explain the observed disparities.

6. Analysis Part I: The Rising Cost of College¶

Analysis by Julia

We begin by examining the entry barrier to higher education: the cost. This section analyzes the spatial distribution of tuition across the U.S. and dissects the financial burden placed on students at public versus private institutions.

6.1 Spatial distribution of tuition and total cost of attendance across states¶

The paired state maps of average tuition and total cost of attendance reveal pronounced spatial heterogeneity in the financial burden associated with college.

In the tuition map, most states in the South and Midwest fall into a relatively low to moderate range, with average annual tuition typically between roughly $8,000 and $15,000. States such as Texas, Georgia, Tennessee, and Missouri are representative of this pattern and form a broad “lower-tuition belt” across the interior of the country.

By contrast, several Northeastern states occupy the upper end of the tuition distribution. Massachusetts, Connecticut, and New York are consistently shaded in the highest categories, with average tuition levels rising into the $30,000–$38,000 range. These states combine higher public-sector tuition with a dense presence of expensive private institutions, contributing to substantially higher average prices than in most other regions.

The map of total cost of attendance, which incorporates tuition, housing, food, transportation, and other living expenses, amplifies these regional differences.

  • In many Southern and Midwestern states, total cost remains in the $25,000–$35,000 range, even after accounting for living expenses.
  • In the Northeast, however, several states exceed $45,000, and some approach or surpass $50,000 per year in total cost.
  • Coastal states in the West, such as California, show a related but distinct pattern: tuition is often in the mid-range, but high housing and urban living costs push total cost of attendance into the upper tiers.

Taken together, the two maps indicate that the financial burden of college is strongly shaped by geography. Some states create relatively low-cost environments through a combination of moderate tuition and manageable living expenses, while others generate a “double burden” of high tuition and high cost of living. This spatial structure forms the backdrop for any discussion of college affordability and suggests that where a student studies can be as important as the nominal price of the institution.

6.2 Tuition structure across institution types¶

The bar chart of average tuition by institution type summarizes a second major dimension of cost variation: structural differences across sectors. The national averages display a clear multi-tiered price hierarchy.

  • Top Tier: Private four-year institutions charge an average of roughly $28,500 in tuition and fees.
  • Middle Tier: Public four-year institutions for out-of-state students form the next tier, with average tuition around $19,500.
  • Lower Tier: Public four-year institutions for in-state students charge substantially less, at approximately $9,100.
  • Entry Tier: Community colleges occupy the bottom tier, with in-state tuition near $4,900.

This pattern reflects a strongly institutionalized price structure. The gap between public in-state and public out-of-state tuition is particularly notable: out-of-state students pay more than twice the in-state amount (roughly $19,500 versus $9,100). This premium is consistent with the design of state-funded higher education, in which resident students benefit from subsidies while non-residents are charged closer to the full cost.
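The out-of-state comparison can be checked with quick arithmetic on the rounded national averages listed above. The variable names are ours, and the inputs are the approximate figures from the list, not exact dataset values.

```python
# Approximate national averages for public four-year institutions.
in_state = 9_100
out_of_state = 19_500

multiple = out_of_state / in_state   # how many times the in-state price
premium = out_of_state - in_state    # extra dollars paid by non-residents

print(f"Out-of-state tuition is {multiple:.2f}x in-state; premium ${premium:,}")
```

On these figures, out-of-state tuition is a little over twice the in-state price, and the extra amount paid is slightly more than the in-state tuition itself.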

The resulting hierarchy implies a stratified market where students with different financial constraints are effectively channeled into different portions of the higher-education system.

6.3 Relative weight of tuition versus living costs¶

The distribution of the tuition-to-cost ratio provides additional insight into how tuition fits within the broader cost structure of attending college. The histogram shows a right-skewed pattern with a concentration of institutions in the range of approximately 0.25 to 0.55.

In other words, for a large share of institutions, tuition accounts for roughly one quarter to one half of the total cost of attendance, with the remainder attributable to living expenses and other non-tuition costs.

  • Low ratios (< 0.15) correspond to institutions in high cost-of-living environments, where housing and everyday expenses dominate the student budget. In these settings, moderate tuition can coexist with very high total costs.
  • High ratios (> 0.80) are typical of institutions in lower-cost areas, where tuition makes up the bulk of total cost because non-tuition expenses are comparatively modest.

This reinforces the conclusion that affordability cannot be inferred from tuition alone. Understanding the true financial burden requires considering how tuition interacts with local living costs.
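To make the ratio concrete, here is a small sketch that computes it for three hypothetical institutions, one in each of the ranges discussed above. The labels and dollar amounts are invented.

```python
# Hypothetical institutions: (label, annual tuition, total cost of attendance)
institutions = [
    ("high cost-of-living campus", 6_000, 42_000),
    ("typical campus", 12_000, 30_000),
    ("low cost-of-living campus", 25_000, 30_000),
]

def tuition_share(tuition, total_cost):
    """Fraction of the total cost of attendance that is tuition."""
    if total_cost <= 0:
        raise ValueError("total cost must be positive")
    return tuition / total_cost

for label, tuition, total in institutions:
    print(f"{label}: ratio = {tuition_share(tuition, total):.2f}")
```

The first campus has the lowest tuition of the three but the highest total cost, which is the pattern a tuition-only comparison would miss.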

6.4 Structural factors underlying cost variation¶

The patterns documented above reflect deeper structural features of the U.S. higher-education system and its broader economic context. Several categories of factors are particularly salient:

  1. State Funding Regimes: States that maintain robust support for public higher education tend to exhibit lower in-state tuition, while those with reduced appropriations rely more heavily on tuition revenue.
  2. Institutional Characteristics: Institutions offering resource-intensive programs (e.g., engineering, health professions) or emphasizing small class sizes typically face higher per-student costs.
  3. Instructional Modality: Fully online institutions can operate with lower physical infrastructure costs, reducing both tuition and non-tuition expenses compared to residential universities.
  4. Student-Level Factors: Military benefits, employer assistance, and institutional grants often mean that the "effective price" paid by a student is lower than the "headline price" analyzed here.

Overall, the cost of college in 2024–2025 is the outcome of interacting geographic, institutional, and policy forces.

7. Analysis Part II: Return on Investment (ROI)¶

Having established the cost, we now turn to the value. Is the significant financial investment worth it?

This section maps the 40-year Net Present Value (NPV) of a degree by state and categorizes the financial yield of different majors. We aim to identify the "High Cost, High Reward" states and quantify the long-term financial gap between STEM/Healthcare majors and Arts/Education majors.

7.1 The Geography of Value¶

The map above reveals a distinct "High Cost, High Reward" dynamic. States identified in Section 6.1 as having the highest costs of attendance, specifically Massachusetts (MA) and California (CA), also appear in the top tier for ROI, with average returns exceeding $500,000 over a graduate's career. Washington, D.C. leads the nation with an average ROI of nearly $679,000. This suggests that while the barrier to entry (tuition) in these coastal regions is high, the labor market outcomes often justify the expense.

7.2 Value by Field of Study¶

As shown in the bar chart, the choice of major is a primary driver of financial success. Healthcare and STEM degrees dominate, with average 40-year returns approaching $702,000 and $586,000 respectively. In contrast, Education majors average approximately $111,000, creating a "Value Gap" where low-cost degrees may yield significantly lower long-term financial stability.

8. Analysis Part III: Career Outcomes & Salary¶

Analysis by Jackson

Finally, we analyze the exit outcomes. Beyond abstract ROI numbers, what do starting salaries and job market projections look like? This section connects the cost and value analysis to the tangible reality of the labor market.

TODO: Career analysis code (Jackson).

9. Limitations¶

While our analysis offers broad insights, we acknowledge several limitations:

  1. State Averages vs. Individual Reality: Aggregating data by state hides rural/urban divides (e.g., earnings in Nashville vs. rural Tennessee).
  2. Assumption of Completion: Our ROI calculations assume on-time graduation. They do not account for students who drop out with debt but no degree.
  3. Inflation Adjustments: Comparing historical tuition data requires precise inflation adjustments, which may vary by sector (CPI vs. HEPI).
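To illustrate the third limitation, the sketch below converts a past nominal tuition figure into base-year dollars using two different price indexes. The index levels are placeholders chosen for easy arithmetic, not actual CPI or HEPI values, and the function name is ours.

```python
def to_real_dollars(nominal, index_then, index_base):
    """Express a past nominal amount in base-year dollars."""
    return nominal * index_base / index_then

# Placeholder index levels, NOT actual CPI or HEPI values.
cpi = {"then": 200.0, "base": 300.0}
hepi = {"then": 250.0, "base": 400.0}

nominal_tuition = 10_000
real_cpi = to_real_dollars(nominal_tuition, cpi["then"], cpi["base"])
real_hepi = to_real_dollars(nominal_tuition, hepi["then"], hepi["base"])

print(f"CPI-adjusted:  ${real_cpi:,.0f}")
print(f"HEPI-adjusted: ${real_hepi:,.0f}")
```

The same nominal figure yields different real values depending on the index, so any historical comparison should state which index it uses.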

10. Future Work¶

Future iterations of this project could incorporate institution-level granularity to compare public vs. private ROI within the same state. Additionally, adding a temporal dimension to the ROI analysis could show how the "Value Gap" between STEM and Humanities has evolved over the last decade.

11. Conclusion¶

Our analysis demonstrates that the "affordability" of a degree cannot be measured by tuition alone. While costs are rising, high-cost states often provide the highest returns, provided students select high-return fields such as STEM or Healthcare. The riskiest educational path appears to be a high-cost institution paired with a low-ROI major, where the break-even point may never be reached.

12. References¶

  1. IPEDS (Integrated Postsecondary Education Data System): nces.ed.gov/ipeds
  2. US Census Bureau (ACS Table S1501): data.census.gov
  3. Foundation for Research on Equal Opportunity (ROI Data): freopp.org
  4. College Tuition Compare: collegetuitioncompare.com